forked from qemu/qemu
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
q35: add virtio-mmio slots #13
Draft
osteffenrh
wants to merge
82
commits into
coconut-svsm:svsm-igvm
Choose a base branch
from
osteffenrh:virtio-pr
base: svsm-igvm
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
pc_machine_kvm_type() was introduced by commit e21be72 ("i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode") to do Xen specific initialization by utilizing kvm_type method. commit eeedfe6 ("hw/xen: Simplify emulated Xen platform init") moves the Xen specific initialization to pc_basic_device_init(). There is no need to keep the PC specific kvm_type() implementation anymore. On the other hand, later patch will implement kvm_type() method for all x86/i386 machines to support KVM_X86_SW_PROTECTED_VM. Signed-off-by: Xiaoyao Li <[email protected]> Reviewed-by: Isaku Yamahata <[email protected]> Reviewed-by: David Hildenbrand <[email protected]>
Decrease array index cpuid_i when CPUID leaf 1F is skipped, otherwise it will get an all zero'ed CPUID entry with leaf 0 and subleaf 0. It conflicts with correct leaf 0. Signed-off-by: Xiaoyao Li <[email protected]>
No such constraint that subleaf index needs to be less than 64. Signed-off-by: Xiaoyao Li <[email protected]>
…ry() The upper 16 bits of kvm_userspace_memory_region::slot are address space id. Parse it separately in trace_kvm_set_user_memory(). Signed-off-by: Xiaoyao Li <[email protected]>
This patch needs to be updated by script scripts/update-linux-headers.sh once gmem fd support is upstreamed in Linux kernel. Signed-off-by: Xiaoyao Li <[email protected]>
Add KVM guest_memfd support to RAMBlock so both normal hva based memory and kvm guest memfd based private memory can be associated in one RAMBlock. Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to create private guest_memfd during RAMBlock setup. Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of confidential guests, such as TDX VM. How and when to set it for memory backends will be implemented in the following patches. Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has KVM guest_memfd allocated. Signed-off-by: Xiaoyao Li <[email protected]> --- Changes in v3: - rename gmem to guest_memfd; - close(guest_memfd) when RAMBlock is released; (Daniel P. Berrangé) - Suqash the patch that introduces memory_region_has_guest_memfd().
KVM allows KVM_GUEST_MEMFD_ALLOW_HUGEPAGE for guest memfd. When the flag is set, KVM tries to allocate memory with transparent hugeapge at first and falls back to non-hugepage on failure. However, KVM defines one restriction that size must be hugepage size aligned when KVM_GUEST_MEMFD_ALLOW_HUGEPAGE is set. Signed-off-by: Xiaoyao Li <[email protected]> --- v3: - New one in v3.
Add a new member "require_guest_memfd" to memory backends. When it's set to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm guest_memfd will be allocated during RAMBlock allocation. Memory backend's @require_guest_memfd is wired with @require_guest_memfd field of MachineState. MachineState::require_guest_memfd is supposed to be set by any VMs that requires KVM guest memfd as private memory, e.g., TDX VM. Signed-off-by: Xiaoyao Li <[email protected]>
Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM. With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that backend'ed both by hva-based shared memory and guest memfd based private memory. Signed-off-by: Chao Peng <[email protected]> Co-developed-by: Xiaoyao Li <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>
Introduce the helper functions to set the attributes of a range of memory to private or shared. This is necessary to notify KVM the private/shared attribute of each gpa range. KVM needs the information to decide the GPA needs to be mapped at hva-based shared memory or guest_memfd based private memory. Signed-off-by: Xiaoyao Li <[email protected]>
…card_range() Commit d3a5038 ("exec: ram_block_discard_range") introduced ram_block_discard_range() which grabs some code from ram_discard_range(). However, during code movement, it changed alignment check of host_startaddr from qemu_host_page_size to rb->page_size. When ramblock is back'ed by hugepage, it requires the startaddr to be huge page size aligned, which is a overkill. e.g., TDX's private-shared page conversion is done at 4KB granularity. Shared page is discarded when it gets converts to private and when shared page back'ed by hugepage it is going to fail on this check. So change to alignment check back to qemu_host_page_size. Signed-off-by: Xiaoyao Li <[email protected]> --- Changes in v3: - Newly added in v3;
…ge() Use __func__ to avoid hard-coded function name. Signed-off-by: Xiaoyao Li <[email protected]>
It's used for discarding opposite memory after memory conversion, for confidential guest. When page is converted from shared to private, the original shared memory can be discarded via ram_block_discard_range(); When page is converted from private to shared, the original private memory is back'ed by guest_memfd. Introduce ram_block_discard_guest_memfd_range() for discarding memory in guest_memfd. Originally-from: Isaku Yamahata <[email protected]> Codeveloped-by: Xiaoyao Li <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>
Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when KVM_EXIT_MEMORY_FAULT happens. It indicates userspace needs to do the memory conversion on the RAMBlock to turn the memory into desired attribute, i.e., private/shared. Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has guest_memfd memory backend. Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is added. Signed-off-by: Chao Peng <[email protected]> Co-developed-by: Xiaoyao Li <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>
Signed-off-by: Isaku Yamahata <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>
Implement mc->kvm_type() for i386 machines. It provides a way for user to create SW_PROTECTE_VM. Also store the vm_type in machinestate to other code to query what the VM type is. Signed-off-by: Xiaoyao Li <[email protected]>
TDX VM requires VM type KVM_X86_TDX_VM to be passed to kvm_ioctl(KVM_CREATE_VM). If tdx-guest object is specified to confidential-guest-support, like, qemu -machine ...,confidential-guest-support=tdx0 \ -object tdx-guest,id=tdx0,... it parses VM type as KVM_X86_TDX_VM. Signed-off-by: Xiaoyao Li <[email protected]>
Introduce a separate function kvm_confidential_guest_init(), which dispatches specific confidential guest initialization function by ms->cgs type. Signed-off-by: Xiaoyao Li <[email protected]> Acked-by: Gerd Hoffmann <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
…ivate value Introduce new flag RAM_DEFAULT_PRIVATE for RAMBlock. It's used to indicate the default attribute, private or not. Set the RAM range to private explicitly when it's default private. Originated-from: Isaku Yamahata <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>
Introduce memory_region_init_ram_guest_memfd() to allocate private guset memfd on the MemoryRegion initialization. It's for the use case of TDVF, which must be private on TDX case. Signed-off-by: Xiaoyao Li <[email protected]>
In mch_realize(), process PAM initialization before SMRAM initialization so that later patch can skill all the SMRAM related with a single check. Signed-off-by: Isaku Yamahata <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>
Add a q35 property to check whether or not SMM ranges, e.g. SMRAM, TSEG, etc... exist for the target platform. TDX doesn't support SMM and doesn't play nice with QEMU modifying related guest memory ranges. Signed-off-by: Isaku Yamahata <[email protected]> Co-developed-by: Sean Christopherson <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>
This reverts commit a1a6b66aa83d26bab1628b92b7720d5e4b672e7e.
TODO: squash into: kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Currently all SEV/SEV-ES functionality is managed through a single 'sev-guest' QOM type. With upcoming support for SEV-SNP, taking this same approach won't work well since some of the properties/state managed by 'sev-guest' is not applicable to SEV-SNP, which will instead rely on a new QOM type with its own set of properties/state. To prepare for this, this patch moves common state into an abstract 'sev-common' parent type to encapsulate properties/state that is common to both SEV/SEV-ES and SEV-SNP, leaving only SEV/SEV-ES-specific properties/state in the current 'sev-guest' type. This should not affect current behavior or command-line options. As part of this patch, some related changes are also made: - a static 'sev_guest' variable is currently used to keep track of the 'sev-guest' instance. SEV-SNP would similarly introduce an 'sev_snp_guest' static variable. But these instances are now available via qdev_get_machine()->cgs, so switch to using that instead and drop the static variable. - 'sev_guest' is currently used as the name for the static variable holding a pointer to the 'sev-guest' instance. Re-purpose the name as a local variable referring the 'sev-guest' instance, and use that consistently throughout the code so it can be easily distinguished from sev-common/sev-snp-guest instances. - 'sev' is generally used as the name for local variables holding a pointer to the 'sev-guest' instance. In cases where that now points to common state, use the name 'sev_common'; in cases where that now points to state specific to 'sev-guest' instance, use the name 'sev_guest' Signed-off-by: Michael Roth <[email protected]>
SEV-SNP support relies on a different set of properties/state than the existing 'sev-guest' object. This patch introduces the 'sev-snp-guest' object, which can be used to configure an SEV-SNP guest. For example, a default-configured SEV-SNP guest with no additional information passed in for use with attestation: -object sev-snp-guest,id=sev0 or a fully-specified SEV-SNP guest where all spec-defined binary blobs are passed in as base64-encoded strings: -object sev-snp-guest,id=sev0, \ policy=0x30000, \ init-flags=0, \ id-block=YWFhYWFhYWFhYWFhYWFhCg==, \ id-auth=CxHK/OKLkXGn/KpAC7Wl1FSiisWDbGTEKz..., \ auth-key-enabled=on, \ host-data=LNkCWBRC5CcdGXirbNUV1OrsR28s..., \ guest-visible-workarounds=AA==, \ See the QAPI schema updates included in this patch for more usage details. In some cases these blobs may be up to 4096 characters, but this is generally well below the default limit for linux hosts where command-line sizes are defined by the sysconf-configurable ARG_MAX value, which defaults to 2097152 characters for Ubuntu hosts, for example. Co-developed-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>
When SEV-SNP is enabled, the KVM_SNP_INIT command is used to initialize the platform. The command checks whether SNP is enabled in the KVM, if enabled then it allocates a new ASID from the SNP pool and calls the firmware to initialize the all the resources. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>
The SNP_LAUNCH_START is called first to create a cryptographic launch context within the firmware. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>
The KVM_SEV_SNP_LAUNCH_UPDATE command is used for encrypting the bios image used for booting the SEV-SNP guest. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>
Test kernel now uses hugepages by default. Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
SEV uses these to pin pages, SNP doesn't need this since it relies on gmem for any private pages. Signed-off-by: Michael Roth <[email protected]>
The IGVM library allows Independent Guest Virtual Machine files to be parsed and processed. IGVM files are used to configure guest memory layout, initial processor state and other configuration pertaining to secure virtual machines. This adds the --enable-igvm configure option, enabled by default, which attempts to locate and link against the IGVM library via pkgconfig and sets CONFIG_IGVM if found. The library is added to the system_ss target in backends/meson.build where the IGVM parsing will be performed by the ConfidentialGuestSupport object. Signed-off-by: Roy Hopkins <[email protected]>
In order to add support for parsing IGVM files for secure virtual machines, a the path to an IGVM file needs to be specified as part of the guest configuration. It makes sense to add this to the ConfidentialGuestSupport object as this is common to all secure virtual machines that potentially could support IGVM based configuration. This patch allows the filename to be configured via the QEMU object model in preparation for subsequent patches that will read and parse the IGVM file. Signed-off-by: Roy Hopkins <[email protected]>
In preparation for supporting the processing of IGVM files to configure guests, this adds a set of functions to ConfidentialGuestSupport allowing configuration of secure virtual machines that can be implemented for each supported isolation platform type such as Intel TDX or AMD SEV-SNP. These functions will be called by IGVM processing code in subsequent patches. This commit provides a default implementation of the functions that either perform no action or generate a warning or error when they are called. Targets that support ConfidentalGuestSupport should override these implementations. Signed-off-by: Roy Hopkins <[email protected]>
This commit adds an implementation of an IGVM loader which parses the file specified as a pararameter to ConfidentialGuestSupport and provides a function that uses the interface in the same object to configure and populate guest memory based on the contents of the file. The IGVM file is parsed when a filename is provided but the code to process the IGVM file is not yet hooked into target systems. This will follow in a later commit. Signed-off-by: Roy Hopkins <[email protected]>
An IGVM file contains configuration of a guest that supports confidential computing hardware. As part of the PC system initialization, the IGVM needs to be processed to apply this configuration before the guest is started. This patch introduces processing of a provided IGVM file at the end of the current PC initialization steps. If an IGVM file has been provided then the directives in the file are processed completing the initialization of the target. If no IGVM file has been specified by the user then no there is no intended consequences in these changes. Signed-off-by: Roy Hopkins <[email protected]>
…h IGVM When using an IGVM file the configuration of the system firmware is defined by IGVM directives contained in the file. In this case the user should not configure any pflash devices. This commit skips initialization of the ROM mode when pflash0 is not set then checks to ensure no pflash devices have been configured when using IGVM, exiting with an error message if this is not the case. Signed-off-by: Roy Hopkins <[email protected]>
The segment registers should be updated via a call to cpu_x86_load_seg_cache(). However, this only works for the general purpose segment registers and not LDTR or TR. In these cases, the function call results in an out-of-bounds access on the env->segs array. Fortunately the out-of-bounds access actually overflows into the correct segment registers. This commit fixes the handling of those segment registers. Signed-off-by: Roy Hopkins <[email protected]>
When an SEV guest is started, the reset vector and state are extracted from metadata that is contained in the firmware volume. In preparation for using IGVM to setup the initial CPU state, the code has been refactored to populate vmcb_save_area for each CPU which is then applied during guest startup and CPU reset. Signed-off-by: Roy Hopkins <[email protected]>
The ConfidentialGuestSupport object defines a number of virtual functions that are called during processing of IGVM directives to query or configure initial guest state. In order to support processing of IGVM files, these functions need to be implemented by relevant isolation hardware support code such as SEV. This commit implements the required functions for SEV-ES and adds support for processing IGVM files for configuring the guest. Signed-off-by: Roy Hopkins <[email protected]>
IGVM support has been implemented for Confidential Guests that support AMD SEV and AMD SEV-ES. Add some documentation that gives some background on the IGVM format and how to use it to configure a confidential guest. Signed-off-by: Roy Hopkins <[email protected]>
When sev_get_capabilities() is called on non-SEV platforms, an error is correctly generated stating it is not supported. The function should return NULL at this point but instead continues to further check for supported features, subsequently causing a crash. This commit adds the missing NULL return. Signed-off-by: Roy Hopkins <[email protected]>
For confidential guests a policy can be provided that defines the security level, debug status, expected launch measurement and other parameters that define the configuration of the confidential platform. This commit adds a new function named set_guest_policy() that can be implemented by each confidential platform, such as AMD SEV to set the policy. This will allow configuration of the policy from a multi-platform resource such as an IGVM file without the IGVM processor requiring specific implementation details for each platform. Signed-off-by: Roy Hopkins <[email protected]>
The initialization secions in IGVM files contains configuration that should be applied to the guest platform before it is started. This includes guest policy and other information that can affect the security level and measurement of a confidential guest. This commit refactors the existing section processing code to prepare it for handling initialization sections as well as directives and iterates the initialization sections in the IGVM file. Signed-off-by: Roy Hopkins <[email protected]>
Adds a handler for the guest policy initialization IGVM section and builds an SEV policy based on this information and the ID block directive if present. The policy is applied using by calling 'set_guest_policy()' on the ConfidentialGuestSupport object. Signed-off-by: Roy Hopkins <[email protected]>
The new cgs_set_guest_policy() function is provided to receive the guest policy flags, SNP ID block and SNP ID authentication from guest configuration such as an IGVM file and apply it to the platform prior to launching the guest. The policy is used to populate values for the existing 'policy', 'id_block' and 'id_auth' parameters. When provided, the guest policy is applied and the ID block configuration is used to verify the launch measurement and signatures. The guest is only successfully started if the expected launch measurements match the actual measurements and the signatures are valid. Signed-off-by: Roy Hopkins <[email protected]>
This fixes the handling of return codes from the KVM_SEV_SNP_LAUNCH_FINISH ioctl. The ioctl returns a success code if the command was successfully sent to the SEV PSP regardless of whether the command itself was successful or not. Any error from the PSP is returned instead via the '&error' parameter. This commit checks 'error' to see if the processing resulted in an error and reports it as required. Signed-off-by: Roy Hopkins <[email protected]>
The synchronization between VMSA and CPU state did not include the GDT, IDT, LDTR or TR registers. These registers had default, non-zero values, therefore if the VMSA did not match these values then a difference in launch measurement is observed. This change adds synchronization of these registers and adds them to the VMSA field check. Signed-off-by: Roy Hopkins <[email protected]>
Fix compile error due to no return statement in function returning non-void. Signed-off-by: Vijay Dhanraj <[email protected]>
Fix compile error when built without --enable-igvm
Implement IGVM policy and ID block support
Disabled by default, enable using '-machine q35,x-svsm-virtio-mmio=on'. When enabled it is possible to plug up to 4 virtio devices into the slots virtio-mmio using '-device virtio-${kind}-device'. The devices can be found at base address 0xfef00000, each slot on a separate page. No IRQ is wired up, the svsm has to drive the devices in polling mode.
8 tasks
joergroedel
force-pushed
the
svsm-igvm
branch
from
December 17, 2024 11:38
260df2b
to
bcb7121
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Disabled by default, enable using '-machine q35,x-svsm-virtio-mmio=on'.
When enabled it is possible to plug up to 4 virtio devices into the slots virtio-mmio using '-device virtio-${kind}-device'.
The devices can be found at base address 0xfef00000, each slot on a separate page. No IRQ is wired up, the svsm has to drive the devices in polling mode.